1. INTRODUCTION

Potato (Solanum tuberosum L.) is one of the world's most important crops, along with wheat, rice,
and maize, because it can be grown in almost all climates [1]. This crop
originated in South America, but it is now consumed worldwide because of its high
nutritional value, which includes a high proportion of starch, dietary fiber, crude protein, and a range of vital
vitamins and minerals [2]. As the global population grows, the demand for agricultural
products is also increasing, forcing agricultural experts and farmers to look for new ways to increase crop yields
and meet the needs of consumers.

Farmers have long used, and still use, the manure of farm animals such as goats, sheep,
cows, and donkeys as organic fertilizer. This practice gives very good results, although excessive
use of such fertilizers has a harmful impact on the air and groundwater [3]. As the
demand for meat has also grown, farmers have resorted to fattening farmyard animals with medicines [4], [5],
and fertilization with the waste of these animals has become a serious danger to the health of the
consumer, because the medicines contain heavy metals [6] that, according to one study, were
found in agricultural products [7].

Because of the large demand for it, the potato, the subject of our study, is also
treated with fertilizers in order to increase its yield. To avoid
the negative consequences of consuming treated potatoes, many studies have
sought to distinguish between treated and untreated tubers on the basis of the smell of potato samples,
using heavy and expensive equipment such as gas chromatography (GC). This technique has been employed in
conjunction with a variety of detectors, including the nitrogen phosphorus detector (NPD), mass spectrometry
(MS), the flame photometric detector (FPD), the flame ionization detector (FID), and the electron capture detector
(ECD) [8].

To make it easier to distinguish between such products, a simple and inexpensive multi-sensor device
that mimics the structure of the biological nose, called the electronic nose, has been used
[9], [10]. It has been applied in many studies of agricultural products and has proven able to
distinguish between treated and untreated samples of, among others, mint [11]-[13], tea [14],
apples [15], and cherry tomatoes [16].

Our study follows the same approach. It aims to distinguish, on the basis of smell and using an
electronic nose, between potato samples grown in soil treated traditionally with manure from sheep and donkeys
that have never been fattened and samples grown in soil treated with manure from fattened chickens.
The remainder of the article is arranged as follows. The first part (material and methods)
describes the electronic nose design and measurement protocol, then the sample preparation process,
and finally the data analysis and the machine learning algorithms used. The findings of the experiments are
presented and discussed in the second part (results and discussion), beginning with the analysis of the sensor
responses and progressing to the results of the machine learning methods. Finally, we
conclude the article.


2. MATERIAL AND METHODS 
2.1. Electronic nose design and measurement protocol 

This research examines the ability of an electronic nose to identify potato samples according to the
treatment they have undergone: traditional treatment with manure from sheep and donkeys that have never been
fattened, or treatment with manure from fattened chickens. Our homemade instrument is made up of three major
components: the sensor array, the data collection, and the data analysis. The sensor array is mainly composed
of the five metal oxide (MOX) sensors listed in Table 1. These sensors are installed on a PCB and have
potentiometers to control the supplied voltages. Data collection is carried out using the DAQ
1901USB card and a program written in LabVIEW. The hardware of the designed electronic nose comprises a
ventilator with a fixed flow, a sample bottle, a sensor chamber, the DAQ 1901USB card, and a computer.
Figure 1 depicts the measurement procedure schematically.


Table 1. The sensors utilized in the electronic nose system

Sensor     Target gas
MQ-7       Carbon monoxide
MQ136      Hydrogen sulfide, ammonia, air, and carbon
TGS821     Hydrogen
TGS822     Vapors of organic solvents such as ethanol
TGS2620    Volatile organic vapors

Figure 1. Measurement tool (schematic: ventilator, airflow, sample chamber, potato-scented airflow, sensor chamber, data acquisition card)


The measurement process begins by placing the sample in the sample
chamber and then switching on the ventilator, which produces the airflow. The airflow passes through the sample
chamber and carries the air enriched with the volatile organic compounds of the potato sample
to the sensor chamber, where each sensor responds with a voltage that depends on its
sensitivity to the smell. These responses are gathered using the DAQ 1901USB acquisition card and recorded in
an Excel file by a program previously created in LabVIEW, for post-processing by machine learning
algorithms.


2.2. Sample preparation

In this study, potato tubers were sampled from two fields in the same location
(33°53'42"N 5°33'17"W, Meknes, Morocco) fertilized in different ways: one field was fertilized in
the traditional way with the manure of sheep and donkeys, and the other was
fertilized with the manure of fattened chickens. The best samples were selected, while samples
that were immature, had chlorophyll spots, showed evidence of disease, or
were damaged during harvesting were excluded. The selected samples were washed, dried, and then cut into
French fries. For each sample type, 100 g was placed in a 1 L glass Erlenmeyer flask, which was then
sealed, labelled, and stored for seven days at a temperature of 25±2°C. Figure 2 depicts some of the
samples used in this research.


Figure 2. Picture of some samples 


2.3. Data analysis

Data preprocessing is the manipulation of data in order to transform it into the most
appropriate form for analysis. It is a necessary step for structuring the data, because
proper preprocessing can make the difference between a useful and a useless model and can improve
performance. After the raw sensor responses are collected, they are converted from analog form into
digital signals that can be interpreted by a computer. The data is then centered using (1) [17], key features
are extracted and selected, and the scaling effect is eliminated by normalization. In the present study, we used the
column normalization method [18], which divides each column by its greatest value (2).


$$V_r = V_m - V_0 \qquad (1)$$


where $V_r$ denotes the resulting relative voltage, $V_m$ is the measured voltage, and $V_0$ is the initial voltage.


$$X'_{ij} = \frac{X_{ij}}{\max(X_i)} \qquad (2)$$


where $X_{ij}$ represents the response of the jth sensor for the ith sample, $X'_{ij}$ is its normalized value, and $X_i$ contains all of the p responses for the ith
sample's sensors. Following that, the data is ready for multivariate analysis.
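
The column normalization described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it follows the prose description (each column divided by its greatest value) and assumes that rows are experiments, columns are features, and every column has a positive maximum. The function name `normalize_columns` and the toy values are hypothetical.

```python
def normalize_columns(matrix):
    """Divide every column of a row-major feature matrix by its largest value.

    Assumes each column has a positive maximum (true for sensor voltages
    above baseline); a zero maximum would raise ZeroDivisionError.
    """
    # zip(*matrix) iterates over columns of the row-major matrix.
    col_max = [max(col) for col in zip(*matrix)]
    return [[x / m for x, m in zip(row, col_max)] for row in matrix]


# Toy matrix: 2 experiments (rows) x 2 features (columns), made-up values.
features = normalize_columns([[2.0, 10.0],
                              [4.0, 5.0]])
print(features)
```

After this step every column has a maximum of 1, so features measured on different scales (for example an area and a peak voltage) contribute comparably to distance-based classifiers.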


2.3.1. K-nearest neighbors algorithm 

First proposed by Fix and Hodges in 1951, as recounted by Silverman and Jones [19], and extended by Cover and Hart in 1967 [20], the
k-nearest neighbours (KNN) method is a simple and effective technique that assigns an input sample to the
class to which the majority of its k closest training samples belong [21]. Even with a limited sample size,
this supervised nonparametric method can perform nonlinear classification.

The performance of this approach depends on two key choices: the number of neighbors K and the distance metric D.
Although a variety of distance measures
can be used to calculate the distance between two points, the Euclidean distance is the most widely used metric. It is
calculated as (3):


$$D = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (3)$$
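
As an illustration of how the Euclidean distance and the majority vote fit together, the following is a minimal pure-Python sketch. It is not the implementation used in this study; the function names, the two-feature toy data, and the class labels are hypothetical.

```python
import math
from collections import Counter


def euclidean(q, p):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


def knn_predict(train_x, train_y, query, k=5):
    """Assign `query` to the majority class among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: euclidean(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized two-feature training samples.
train_x = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
           [0.90, 0.80], [0.80, 0.90], [0.85, 0.85]]
train_y = ["traditional"] * 3 + ["chicken"] * 3

print(knn_predict(train_x, train_y, [0.2, 0.2], k=3))
```

With two classes, an odd k avoids tied votes, which is one practical reason for choosing values such as 3 or 5.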


2.3.2. Support vector machines algorithm 

This technique, introduced by Vapnik [22] in 1995, is a supervised learning algorithm that uses a
linear or nonlinear kernel function to classify data; this function tries to separate the classes using hyperplanes.
The hyperplanes are chosen such that no data points lie between them and the margin
between the pair of classes is maximal. The data points located on these hyperplanes are called support vectors. The hyperplane
that passes through the middle of the maximum distance separating the two said hyperplanes is the best solution; its
equation is given by:


$$y(x) = w^T \varphi(x) = \sum_{k=1}^{K} w_k \varphi_k(x) + w_0 = 0 \qquad (4)$$


where $\varphi(x) = [\varphi_0(x), \varphi_1(x), \ldots, \varphi_K(x)]^T$ with $\varphi_0(x) = 1$, and w is the network's weight vector
$w = [w_0, w_1, \ldots, w_K]^T$. If the data vector x meets the criterion y(x) > 0, it is considered a member of the first
class, and if y(x) < 0 it is considered a member of the opposite class [23].
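
The decision rule can be made concrete with a short sketch. It assumes a linear kernel, so the feature map is taken to be the identity with a constant first component equal to 1; the weight vector below is hypothetical and chosen by hand for illustration, not obtained by training.

```python
def decision_value(w, x):
    """Weighted sum of the feature map [1, x_1, ..., x_K] (linear case)."""
    phi = [1.0] + list(x)          # constant component first, so w[0] is the bias
    return sum(wk * pk for wk, pk in zip(w, phi))


def classify(w, x):
    """Apply the sign rule: positive -> first class, negative -> opposite class."""
    y = decision_value(w, x)
    if y > 0:
        return "first class"
    if y < 0:
        return "opposite class"
    return "on the hyperplane"


# Hypothetical weights [w_0, w_1, w_2]: the hyperplane is x_1 + x_2 = 1.
w = [-1.0, 1.0, 1.0]
print(classify(w, [0.8, 0.7]))
print(classify(w, [0.2, 0.3]))
```

In a trained SVM the weights are the ones that maximize the margin between the two classes; only the final sign test is shown here.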


3. RESULTS AND DISCUSSION 
3.1. Sensor responses 

In order to distinguish the potato samples according to their origins, our experiment used
samples grouped by the treatment they underwent: samples treated with chicken manure and samples
treated traditionally with donkey and sheep manure, which is the most common treatment. Data from the potato
samples were collected in three stages. In the first stage, the sensor chamber was ventilated for 10 minutes to
clean the sensors and reach the baseline. In the second stage, the sensor chamber was closed, and the stability of the
sensor responses was checked for 30 seconds. In the third and final stage, the potato headspace was
carried by the airflow from the fan and injected for 8 minutes
into the sensor chamber. The experiment was then repeated. Each type of sample was
subjected to 20 experiments, bringing the total to 40 experiments. An example of the recorded sensor
responses is shown in Figure 3.

From these responses, we first noticed that the intensities of the sensors'
responses increased once the measurement was triggered, which shows that the sensors detect the
presence of the samples and the change in the composition of the smell in the sensor chamber. We
also noted that the responses for the samples of potatoes treated with chicken
manure are in most cases greater than the responses for the samples of potatoes treated traditionally.


Figure 3. TGS-2620 responses


3.2. Features extraction 

Choosing features requires careful examination of the response curves, taking into account the
purpose of the study. In this study, the aim is to distinguish between groups of samples according to the type
of fertilization. With this in mind, we identified three features to help reach this
goal: i) the maximum value (Vmax), ii) the area under the signal (area), and iii) the stabilized signal value (MS) from
450 to 500 seconds. As a consequence, our data matrix contains 15 columns (3 features * 5 sensors) and
40 rows (20 experiments * 2 types of potatoes).
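
The three features can be computed from a single sensor trace as sketched below. This is an illustration under stated assumptions, not the authors' procedure: the area is approximated with the trapezoidal rule, the stabilized value is taken as the mean over the 450 to 500 second window, and the trace is a synthetic ramp sampled once per second. The name `extract_features` is hypothetical.

```python
def extract_features(times, volts, window=(450.0, 500.0)):
    """Return [maximum, area under the curve, mean over the stabilized window]."""
    v_max = max(volts)
    # Trapezoidal rule over consecutive sample pairs.
    area = sum((t1 - t0) * (v0 + v1) / 2.0
               for t0, t1, v0, v1 in zip(times, times[1:], volts, volts[1:]))
    plateau = [v for t, v in zip(times, volts) if window[0] <= t <= window[1]]
    steady = sum(plateau) / len(plateau)
    return [v_max, area, steady]


# Synthetic trace: linear rise to 2.0 V over 200 s, then a plateau until 500 s.
times = [float(t) for t in range(0, 501)]
volts = [min(t / 100.0, 2.0) for t in times]

# One experiment = one row: 3 features for each of the 5 sensors.
# The same synthetic trace stands in for all five sensors here.
row = []
for _ in range(5):
    row.extend(extract_features(times, volts))
print(len(row))
```

Stacking one such row per experiment gives the 40 x 15 data matrix described above.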


3.3. Classification of potato samples with the KNN and the SVM methods

Support vector machines (SVM) and KNN are among the most useful classification algorithms.
They are supervised classification methods, in which the groups of the training data are predetermined [24]. To
limit overfitting and obtain a reliable estimate of classification accuracy, 5-fold cross-validation was used to
evaluate each model. This means that the data set is randomly divided into 5 folds of roughly the same size;
each fold in turn is used for testing while the other four are used for training, until every fold has served as the
test set.
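
The fold construction can be sketched as follows for the 40 experiments of this study. This is a minimal, non-stratified illustration; the function name `k_fold_indices` and the fixed random seed are assumptions made for reproducibility and are not taken from the original work.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::n_folds] for i in range(n_folds)]
    return [(sorted(set(indices) - set(fold)), sorted(fold)) for fold in folds]


splits = k_fold_indices(40)
for train, test in splits:
    print(len(train), len(test))
```

With 40 samples and 5 folds, each model is trained on 32 samples and tested on the remaining 8, and every sample is tested exactly once.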

For the SVM, the linear kernel was employed; it is among the most commonly used kernels. For
KNN, the classification was performed with 5 neighbors, the Euclidean distance metric, and
equal distance weighting. The two methods produced the same result, shown in Figure 4. The confusion
matrix is used to compute the performance parameters of the detection models. The first group, potatoes
treated traditionally, had a success rate of 95%, while the second group, potatoes treated with chicken
manure, had a success rate of 100%. The overall success rate was therefore 97.5%.
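
The arithmetic behind these rates can be checked with a short sketch. The counts below are inferred from the reported rates and the 20 experiments per class (19 of 20 and 20 of 20 correct); they are an assumption for illustration and were not read from Figure 4.

```python
def per_class_rates(confusion):
    """Fraction of correctly classified samples for each true class (row)."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def overall_rate(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Rows: true class; columns: predicted class (traditional, chicken).
confusion = [[19, 1],    # traditionally treated: one sample misclassified
             [0, 20]]    # chicken manure: all samples correct

rates = per_class_rates(confusion)
overall = overall_rate(confusion)
print([f"{r:.1%}" for r in rates], f"{overall:.1%}")
```

The overall rate is 39 correct out of 40, which matches the single misclassified sample.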

To visualize the data, we plotted them in a two-dimensional plane, taking as the
horizontal axis the stabilized value obtained from the MQ-7 sensor and as the vertical axis the maximum
value obtained from the TGS-822 sensor. The result is shown in Figure 5. According to Figure 5, the samples
from each group were correctly assigned to their classes, with the exception of one sample
from the group of potatoes treated traditionally, which was misclassified. Nonetheless, the success rate was
excellent. Given this very good result, and in comparison with results obtained with other instruments
that are heavy, difficult to handle, and expensive, such as gas chromatography-mass spectrometry (GC-MS)
[25], [26], headspace solid-phase microextraction/gas chromatography-mass spectrometry
(HS-SPME/GC-MS) and isotope ratio mass spectrometry (IRMS) [27], [28], and near-infrared reflectance
spectroscopy (NIR) [29], [30], used in different applications related to the study of potatoes, we can say that
our electronic nose, which is made up mainly of commercial gas sensors, could be used as an efficient device
for the classification of potatoes according to their crop fields.


Figure 4. Confusion matrix of the algorithms used


Figure 5. The KNN and SVM result for the classification of potatoes according to their cultivated field


4. CONCLUSION

The consumption of food treated with manure from medication-fattened poultry represents a serious
danger to the health of consumers. With this in mind, we built an electronic nose
made up mainly of commercial gas sensors to monitor the quality of potatoes by identifying the
samples treated with this chicken manure. Using two machine learning algorithms, KNN and
SVM, with 5-fold cross-validation, we reached a success rate of 97.5% in the classification of the samples,
which means our homemade tool is able to distinguish between potatoes treated traditionally with sheep
and donkey manure and potatoes treated with chicken manure. The strong points
of our tool also lie in its modest cost and its simple design.